Document Image Collection Using Amazon's Mechanical Turk
Abstract
We present findings from a collaborative effort aimed at testing the feasibility of using Amazon’s Mechanical Turk as a data collection platform to build a corpus of document images. Experimental design and implementation workflow are described. Preliminary findings and directions for future work are also discussed.
Similar Resources
Paragraph Acquisition and Selection for List Question Using Amazon's Mechanical Turk
Creating more fine-grained annotated data than previously relevant document sets is important for evaluating individual components in automatic question answering systems. In this paper, we describe using Amazon’s Mechanical Turk (AMT) to judge whether paragraphs in relevant documents answer corresponding list questions in the TREC QA track 2004. Based on AMT results, we build a collection of 1...
DP2: Distributed 3D image segmentation using micro-labor workforce
This application note describes a new scalable semi-automatic approach, the Dual Point Decision Process, for segmentation of 3D structures contained in 3D microscopy. The segmentation problem is distributed to many individual workers such that each receives only simple questions regarding whether two points in an image are placed on the same object. A large pool of micro-labor workers a...
Using Amazon's Mechanical Turk for Annotating Medical Named Entities.
Amazon's Mechanical Turk (AMT) service is becoming increasingly popular in Natural Language Processing (NLP) research. In this poster, we report our findings in using AMT to annotate biomedical text extracted from clinical trial descriptions with three entity types: medical condition, medication, and laboratory test. We also describe our observations on AMT workers' annotations.
Creating Speech and Language Data With Amazon's Mechanical Turk
In this paper we give an introduction to using Amazon’s Mechanical Turk crowdsourcing platform for the purpose of collecting data for human language technologies. We survey the papers published in the NAACL2010 Workshop. 24 researchers participated in the workshop’s shared task to create data for speech and language applications with $100.
Establishing a Database for Studying Human Face Photograph Memory
Contemporary visual environments bombard us with hundreds of face images every day, and this places a nontrivial demand on long-term memory. However, little is known about what makes certain faces remain in our memories, while others are quickly forgotten. To establish a basis for face memorability exploration, we assembled a database of 8,690 face photographs from online sources, spanning dive...
Publication date: 2010